(list item) tag:
First list item
Berners-Lee, Connolly, et. al. Page 33
HTML 2.0 February 8, 1995
Second list item
Third list item
2.14 Other Elements
2.14.1 Paragraph
Level 0
The Paragraph element indicates a paragraph. The exact
indentation, leading, etc. of a paragraph is not defined
and may be a function of other tags, style sheets, etc.
Typically, paragraphs are surrounded by a vertical space
of one line or half a line. This is typically not the
case within the Address element, and is never the case
within the Preformatted Text element. With some HTML
user agents, the first line in a paragraph is indented.
Example of use:
This Heading Precedes the Paragraph
This is the text of the first paragraph.
This is the text of the second paragraph. Although you
do not need to start paragraphs on new lines, maintaining
this convention facilitates document maintenance.
This is the text of a third paragraph.
2.14.2 Preformatted Text
...
Level 0
The Preformatted Text element presents blocks of text in
fixed-width font, and so is suitable for text that has
been formatted on screen.
The <PRE> tag may be used with the optional WIDTH
attribute, which is a Level 1 feature. The WIDTH
attribute specifies the maximum number of characters for
a line and allows the HTML user agent to select a
suitable font and indentation. If the WIDTH attribute is
not present, a width of 80 characters is assumed. Where
the WIDTH attribute is supported, widths of 40, 80 and
132 characters should be presented optimally, with other
widths being rounded up.
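
The WIDTH handling described above can be sketched as follows. This
is only an illustration, not part of the specification: the function
name is hypothetical, and the treatment of widths above 132 is an
assumption, since the text does not say what to do with them.

```python
# Widths the text says should be presented optimally.
PREFERRED_WIDTHS = (40, 80, 132)
# Width assumed when the WIDTH attribute is not present.
DEFAULT_WIDTH = 80


def presentation_width(width=None):
    """Pick the line width a user agent might lay out a PRE block for."""
    if width is None:
        return DEFAULT_WIDTH
    for preferred in PREFERRED_WIDTHS:
        if width <= preferred:
            # Round other widths up to the next preferred width.
            return preferred
    # Assumption: widths beyond 132 are used as given.
    return width
```

Under these assumptions, a WIDTH of 60 is laid out as 80 columns and
a WIDTH of 100 as 132 columns.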
Within preformatted text:
- Line breaks within the text are rendered as a move
to the beginning of the next line.
- The <P> tag should not be used. If found, it should
be rendered as a move to the beginning of the next line.
- Anchor elements and character highlighting elements
may be used.
- Elements that define paragraph formatting
(headings, address, etc.) must not be used.
- The horizontal tab character (encoded in US-ASCII
and ISO-8859-1 as decimal 9) must be
interpreted as the smallest positive nonzero number of
spaces which will leave the number of characters so far
on the line as a multiple of 8. Its use is not
recommended, however.
NOTE: References to the "beginning of a new line" do not
imply that the renderer is forbidden from using a
constant left indent for rendering preformatted text.
The left indent may be constrained by the width
required.
Example of use:
This is an example line.
NOTE: Within a Preformatted Text element, the constraint
that the rendering must be on a fixed horizontal
character pitch may limit or prevent the ability of the
HTML user agent to render highlighting elements
specially.
2.14.3 Line Break
Level 0
The Line Break element specifies that a new line must be
started at the given point. A new line indents the same
as that of line-wrapped text.
Example of use:
Pease porridge hot
Pease porridge cold
Pease porridge in the pot
Nine days old.
2.14.4 Horizontal Rule
Level 0
A Horizontal Rule element is a divider between sections
of text such as a full width horizontal rule or
equivalent graphic.
Example of use:
February 8, 1995, CERN
2.15 Form Elements
Forms are created by placing input fields within
paragraphs, preformatted/literal text, and lists. This
gives considerable flexibility in designing the layout
of forms.
The following elements (all are HTML 2 features) are
used to create forms:
FORM
A form within a document.
INPUT
One input field.
OPTION
One option within a Select element.
SELECT
A selection from a finite set of options.
TEXTAREA
A multi-line input field.
Each variable field is defined by an Input, Textarea, or
Option element and must have a NAME attribute to
identify its value in the data returned when the form is
submitted.
Example of use (a questionnaire form):
Sample Questionnaire
Please fill out this questionnaire:
In the example above, the and
tags have been
used to lay out the text and input fields. The HTML user
agent is responsible for handling which field will
currently get keyboard input.
Many platforms have existing conventions for forms, for
example, using Tab and Shift keys to move the keyboard
focus forwards and backwards between fields, and using
the Enter key to submit the form. In the example, the
SUBMIT and RESET buttons are specified explicitly with
special purpose fields. The SUBMIT button is used to e-
mail the form or send its contents to the server as
specified by the ACTION attribute, while RESET resets
the fields to their initial values. When the form
consists of a single text field, it may be appropriate
to leave such buttons out and rely on the Enter key.
The Input element is used for a large variety of types
of input fields.
To let users enter more than one line of text, use the
Textarea element.
2.15.1 Representing Choices
The radio button and checkbox types of input field can
be used to specify multiple choice forms in which every
alternative is visible as part of the form. An
alternative is to use the Select element which is
typically rendered in a more compact fashion as a pull
down combo list.
2.15.2 Form
...
Level 2
The Form element is used to delimit a data input form.
There can be several forms in a single document, but the
Form element can't be nested.
The ACTION attribute is a URL specifying the location to
which the contents of the form are submitted to elicit a
response. If the ACTION attribute is missing, the URL of
the document itself is assumed. The way data is
submitted varies with the access protocol of the URL,
and with the values of the METHOD and ENCTYPE
attributes.
In general:
- the METHOD attribute selects variations in the
protocol.
- the ENCTYPE attribute specifies the format of the
submitted data in case the protocol does not impose a
format itself.
The Level 2 specification defines and requires support
for the HTTP access protocol only.
When the ACTION attribute is set to an HTTP URL, the
METHOD attribute must be set to an HTTP method as
defined by the HTTP method specification in the IETF
draft HTTP standard. The default METHOD is GET, although
for many applications, the POST method may be preferred.
With the POST method, the ENCTYPE attribute is a MIME
type specifying the format of the posted data; by
default, this is application/x-www-form-urlencoded.
Under any protocol, the submitted contents of the form
logically consist of name/value pairs. The names are
usually equal to the NAME attributes of the various
interactive elements in the form.
NOTE: The names are not guaranteed to be unique keys,
nor are the names of form elements required to be
distinct. The values encode the user's input to the
corresponding interactive elements. Elements capable of
displaying a textual or numerical value will return a
name/value pair even when they receive no explicit user
input.
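
As an illustration of name/value pairs in the default
application/x-www-form-urlencoded format, the following sketch uses
Python's standard urllib.parse module. The function names and the
sample field names are hypothetical. The pairs are kept as an ordered
list rather than a dictionary because names are not guaranteed to be
unique.

```python
from urllib.parse import parse_qsl, urlencode


def encode_form(pairs):
    """Encode a list of (name, value) pairs as a urlencoded string."""
    return urlencode(pairs)


def decode_form(body):
    """Recover the (name, value) pairs, keeping empty values."""
    return parse_qsl(body, keep_blank_values=True)


# Hypothetical form contents; "city" appears twice on purpose.
pairs = [("name", "A. User"), ("city", "kent"), ("city", "miami"),
         ("comment", "")]
body = encode_form(pairs)
```

A field that received no input still appears in the result, with an
empty value, and decoding the string yields the original list of
pairs, duplicates included.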
2.15.3 Input
Level 2
The Input element represents a field whose contents may
be edited by the user.
Attributes of the Input element:
ALIGN
Vertical alignment of the image. For use only with
TYPE=IMAGE in HTML level 2. The possible values are
exactly the same as for the ALIGN attribute of the image
element.
CHECKED
Indicates that a checkbox or radio button is selected.
Unselected checkboxes and radio buttons do not return
name/value pairs when the form is submitted.
MAXLENGTH
Indicates the maximum number of characters that can be
entered into a text field. This can be greater than
specified by the SIZE attribute, in which case the field
will scroll appropriately. The default number of
characters is unlimited.
NAME
Symbolic name used when transferring the form's
contents. The NAME attribute is required for most input
types and is normally used to provide a unique
identifier for a field, or for a logically related group
of fields.
SIZE
Specifies the size or precision of the field according
to its type. For example, to specify a field with a
visible width of 24 characters:
<INPUT TYPE=text SIZE="24">
SRC
A URL or URN specifying an image. For use only with
TYPE=IMAGE in HTML Level 2.
TYPE
Defines the type of data the field accepts. Defaults to
free text. Several types of fields can be defined with
the type attribute:
CHECKBOX
Used for simple Boolean attributes, or for attributes
that can take multiple values at the same time. The
latter is represented by a number of checkbox fields
each of which has the same name. Each selected checkbox
generates a separate name/value pair in the submitted
data, even if this results in duplicate names. The
default value for checkboxes is "on".
HIDDEN
No field is presented to the user, but the content of
the field is sent with the submitted form. This value
may be used to transmit state information about
client/server interaction.
IMAGE
An image field upon which you can click with a pointing
device, causing the form to be immediately submitted.
The coordinates of the selected point are measured in
pixel units from the upper-left corner of the image, and
are returned (along with the other contents of the form)
in two name/value pairs. The x-coordinate is submitted
under the name of the field with .x appended, and the y-
coordinate is submitted under the name of the field with
.y appended. Any VALUE attribute is ignored. The image
itself is specified by the SRC attribute, exactly as for
the Image element.
NOTE: In a future version of the HTML specification, the
IMAGE functionality may be folded into an enhanced
SUBMIT field.
PASSWORD
The same as the TEXT type, except that text is not
displayed as it is entered.
RADIO
Used for attributes that accept a single value from a
set of alternatives. Each radio button field in the
group should be given the same name. Only the selected
radio button in the group generates a name/value pair in
the submitted data. Radio buttons require an explicit
VALUE attribute.
RESET
A button that, when pressed, resets the form's fields to
their specified initial values. The label to be
displayed on the button may be specified just as for the
SUBMIT button.
SUBMIT
A button that, when pressed, submits the form. You can
use the VALUE attribute to provide a non-editable label
to be displayed on the button. The default label is
application-specific. If a SUBMIT button is pressed in
order to submit the form, and that button has a NAME
attribute specified, then that button contributes a
name/value pair to the submitted data. Otherwise, a
SUBMIT button makes no contribution to the submitted
data.
TEXT
Used for single-line text entry fields. Use in
conjunction with the SIZE and MAXLENGTH attributes. Use
the Textarea element for text fields which can accept
multiple lines.
VALUE
The initial displayed value of the field, if it displays
a textual or numerical value; or the value to be
returned when the field is selected, if it displays a
Boolean value. This attribute is required for radio
buttons.
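
The rules above for which Input fields contribute name/value pairs
can be summarized in a short sketch. It is an illustration under
stated assumptions, not a normative algorithm: the function name, the
dictionary layout used to describe fields, and the "pressed" and
"click" parameters are all hypothetical.

```python
def form_data_set(fields, pressed=None, click=(0, 0)):
    """Return the (name, value) pairs a form would submit.

    fields:  list of dicts mirroring INPUT attributes
             ("type", "name", "value", "checked").
    pressed: name of the SUBMIT or IMAGE field used to submit, if any.
    click:   (x, y) point selected on an IMAGE field, in pixels.
    """
    pairs = []
    for field in fields:
        kind = field.get("type", "text").lower()  # TYPE defaults to text
        name = field.get("name")
        value = field.get("value", "")
        if kind in ("text", "password", "hidden"):
            # Textual fields return a pair even with no user input.
            pairs.append((name, value))
        elif kind == "checkbox":
            if field.get("checked"):
                # The default value for checkboxes is "on".
                pairs.append((name, field.get("value", "on")))
        elif kind == "radio":
            if field.get("checked"):
                pairs.append((name, value))
        elif kind == "submit":
            # Only a named SUBMIT button that was pressed contributes.
            if name is not None and name == pressed:
                pairs.append((name, value))
        elif kind == "image":
            if name is not None and name == pressed:
                x, y = click
                pairs.append((name + ".x", str(x)))
                pairs.append((name + ".y", str(y)))
        # RESET fields never contribute to the submitted data.
    return pairs
```

Two checked checkboxes sharing a name produce two pairs with that
name, while unselected checkboxes and radio buttons produce none.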
2.15.4 Option
Level 2
The Option element can only occur within a Select
element. It represents one choice, and can take these
attributes:
DISABLED
Proposed.
SELECTED
Indicates that this option is initially selected.
VALUE
When present indicates the value to be returned if this
option is chosen. The returned value defaults to the
contents of the Option element.
The contents of the Option element are presented to the
user to represent the option. They are used as the returned
value if the VALUE attribute is not present.
2.15.5 Select
...
Level 2
The Select element allows the user to choose one of a set
of alternatives described by textual labels. Every
alternative is represented by the Option element.
Attributes are:
ERROR
Proposed.
MULTIPLE
The MULTIPLE attribute is needed when users are allowed
to make several selections, e.g. <SELECT MULTIPLE>.
NAME
Specifies the name that will be submitted as a name/value
pair.
SIZE
Specifies the number of visible items. If this is
greater than one, then the resulting form control will
be a list.
The Select element is typically rendered as a pull down
or pop-up list. For example:
Vanilla
Strawberry
Rum and Raisin
Peach and Orange
If no option is initially marked as selected, then the
first item listed is selected.
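
The way a Select element arrives at its returned value or values can
be sketched as below. The function name and the dictionary layout for
options are hypothetical, and keeping only the first marked option
when MULTIPLE is absent is an assumption made for the sketch.

```python
def select_values(options, multiple=False):
    """Return the value(s) a Select element would submit.

    options: list of dicts with "content" (the text of the Option
             element) and optional "value" and "selected" keys.
    """
    chosen = [opt for opt in options if opt.get("selected")]
    if not chosen and options:
        # No option marked as selected: the first item is selected.
        chosen = [options[0]]
    if not multiple:
        # Assumption: without MULTIPLE, keep one selection only.
        chosen = chosen[:1]
    # The returned value defaults to the contents of the Option.
    return [opt.get("value", opt["content"]) for opt in chosen]


flavors = [
    {"content": "Vanilla"},
    {"content": "Strawberry"},
    {"content": "Rum and Raisin", "value": "RumRaisin"},
    {"content": "Peach and Orange"},
]
```

With no option marked, the sketch returns the first flavor. An option
that carries a VALUE attribute returns that value rather than its
displayed text.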
2.15.6 Text Area
...
Level 2
The Textarea element lets users enter more than one line
of text. For example:
HaL Computer Systems
1315 Dell Avenue
Campbell, California 95008
The text up to the end tag (</TEXTAREA>) is used to
initialize the field's value. This end tag is always
required even if the field is initially blank. When
submitting a form, lines in a TEXTAREA should be
terminated using CR/LF.
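
A minimal sketch of the CR/LF convention for submitted TEXTAREA
contents follows. The function name is hypothetical, and the sketch
assumes that the user agent may hold the text internally with any of
the common line break conventions.

```python
import re

# Matches CR LF first, then a lone CR or a lone LF.
LINE_BREAK = re.compile(r"\r\n|\r|\n")


def textarea_submission_value(text):
    """Rewrite every line break in the field's text as CR LF."""
    return LINE_BREAK.sub("\r\n", text)


address = "HaL Computer Systems\n1315 Dell Avenue\nCampbell, California 95008"
submitted = textarea_submission_value(address)
```

Text that already uses CR/LF passes through unchanged, because the
pattern matches the two-character sequence as a single line break.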
In a typical rendering, the ROWS and COLS attributes
determine the visible dimension of the field in
characters. The field is rendered in a fixed-width font.
HTML user agents should allow text to extend beyond
these limits by scrolling as needed.
NOTE: In the initial design for forms, multi-line text
fields were supported by the Input element with
TYPE=TEXT. Unfortunately, this causes problems for
fields with long text values. SGML's default (Reference
Quantity Set) limits the length of attribute literals to
only 240 characters. The HTML 2.0 SGML declaration
increases the limit to 1024 characters.
2.16 Character Data
Level 0
The characters between HTML tags represent text. An HTML document
(including tags and text) is encoded using the coded character
set specified by the "charset" parameter of the "text/html"
media type. For levels defined in this specification, the
"charset" parameter is restricted to "US-ASCII" or "ISO-8859-1".
ISO-8859-1 encodes a set of characters known as Latin Alphabet
No. 1, or simply Latin-1. Latin-1 includes characters from most
Western European languages, as well as a number of control
characters. Latin-1 also includes a non-breaking space, a soft
hyphen indicator, 93 graphical characters, 8 unassigned
characters, and 25 control characters.
Because non-breaking space and soft hyphen indicator are
not recognized and interpreted by all HTML user agents,
their use is discouraged.
There are 58 character positions occupied by control
characters. See Section 2.16.2 for details on the
interpretation of control characters.
Because certain special characters are subject to
interpretation and special processing, information
providers and HTML user agent implementors should follow
the guidelines in Section 2.16.1.
In addition, HTML provides
character entity references (see Section 2.17.2) and
numerical character references (see Section 2.17.3) to
facilitate the entry and interpretation of characters by
name and by numerical position.
Because certain characters will be interpreted as
markup, they must be represented by entity references as described
in Section 2.16.3 and Section 2.16.4.
2.16.1 Special Characters
Certain characters have special meaning in HTML
documents. There are two printing characters which may
be interpreted by an HTML application to have an effect
on the format of the text:
Space
- Interpreted as a word space (place where a line can
be broken) in all contexts except the Preformatted Text
element.
- Interpreted as a nonbreaking space within the
Preformatted Text element.
Hyphen
- Interpreted as a hyphen glyph in all contexts
- Interpreted as a potential word space by a
hyphenation engine
2.16.2 Control Characters
Control characters are non-printable characters that are
typically used for communication and device control, as
format effectors, and as information separators.
In SGML applications, the use of control characters is
limited in order to maximize the chance of successful
interchange over heterogeneous networks and operating
systems. In HTML, only three control characters are
used: Horizontal Tab (HT, encoded as 9 decimal
in US-ASCII and ISO-8859-1), Carriage Return, and
Line Feed.
Horizontal Tab is interpreted as a word space in all contexts
except preformatted text. Within preformatted text, the tab
should be interpreted to shift the horizontal column position
to the next position which is a multiple of 8 on the same
line; that is, col := (col+8) - (col mod 8).
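
The tab rule for preformatted text (advance to the next column that
is a multiple of 8) can be sketched as follows. The function name is
hypothetical, and columns are counted from zero at the start of the
line.

```python
TAB_STOP = 8


def expand_tabs(line, tab_stop=TAB_STOP):
    """Replace each horizontal tab with spaces up to the next tab stop."""
    out = []
    col = 0  # number of characters so far on the line
    for ch in line:
        if ch == "\t":
            # Always at least one space and at most tab_stop spaces.
            spaces = tab_stop - (col % tab_stop)
            out.append(" " * spaces)
            col += spaces
        else:
            out.append(ch)
            col += 1
    return "".join(out)
```

A tab at column 2 becomes six spaces, and a tab at column 8 becomes
a full eight spaces, so a tab always moves the position forward.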
Carriage Return and Line Feed are conventionally used
to represent end of line. For Internet Media Types defined as
"text/*", the sequence CR LF is used to represent an end of
line. In practice, text/html documents are frequently
represented and transmitted using an end of line convention
that depends on the conventions of the source of the
document; frequently, that representation consists of CR
only, LF only, or CR LF combination. In HTML, end of line in
any of its variations is interpreted as a word space in all
contexts except preformatted text. Within preformatted text,
HTML interpreting agents should expect to treat any of the
three common representations of end-of-line as starting
a new line.
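
The two treatments of end of line can be sketched as below. Both
function names are hypothetical, and the sketch does not attempt any
further collapsing of adjacent word spaces.

```python
import re

# CR LF is tried first so that it counts as one end of line.
END_OF_LINE = re.compile(r"\r\n|\r|\n")


def preformatted_lines(text):
    """Within preformatted text, each end of line starts a new line."""
    return END_OF_LINE.split(text)


def as_word_spaces(text):
    """Elsewhere, each end of line is interpreted as a word space."""
    return END_OF_LINE.sub(" ", text)
```

For example, a document fragment that mixes all three conventions
splits into the same lines regardless of which one each line uses.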
2.16.3 Numeric Character References
In addition to any mechanism by which characters may be
represented by the encoding of the HTML document, it is
possible to explicitly reference the printing characters of
the ISO-8859-1 character encoding using a numeric character
reference. See Section
2.17.1 for a list of the characters, their names and
input syntax.
Two reasons for using a numeric character reference:
- the keyboard does not provide a key for the
character, such as on U.S. keyboards which do not
provide European characters
- the character may be interpreted as SGML coding,
such as the ampersand (&), double quotes ("), the less-than
(<) and greater-than (>) characters
Numeric character references are represented in an HTML
document as SGML entities whose name is number sign (#)
followed by a numeral in the range 32-126 or 161-255. The HTML
DTD includes a numeric character reference for each of the
printing characters of the ISO-8859-1 encoding, so that one
may reference them by number if it is inconvenient to enter
them directly:
the ampersand (&#38;), double quotes (&#34;),
less-than (&#60;) and greater-than (&#62;) characters
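
Producing numeric character references can be sketched as follows.
The function name is hypothetical, and the choice to rewrite only
the SGML markup characters and the positions 161-255 is an assumption
made for illustration.

```python
def to_numeric_references(text, markup_chars='&"<>'):
    """Rewrite markup and upper Latin-1 characters as &#N; references."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in markup_chars or 161 <= code <= 255:
            # Number sign followed by the decimal character position.
            out.append("&#%d;" % code)
        else:
            # Other characters are passed through unchanged.
            out.append(ch)
    return "".join(out)
```

Python's standard html.unescape function reverses the substitution,
which gives a simple round-trip check for the output.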
2.16.4 Character Entities
In addition, many of the Latin alphabet No. 1 set of printing
characters may be represented within the text of an HTML
document by a character entity. See 2.17.2 for a list of
the characters, names, input syntax, and descriptions.
See 5.2.1 for the SGML entity definitions of "Added
Latin 1 for HTML".
Two reasons for using a character entity:
- the keyboard does not provide a key for the
character, such as on U.S. keyboards which do not
provide European characters
- the character may be interpreted as SGML coding,
such as the ampersand (&), double quotes ("), the less-than
(<) and greater-than (>) characters
A character entity is represented in an HTML document as
an SGML entity whose name is defined in the HTML DTD.
The HTML DTD includes a character entity for each of the
SGML markup characters and for each of the printing
characters in the upper half of Latin-1, so that one may
reference them by name if it is inconvenient to enter
them directly:
the ampersand (&amp;), double quotes (&quot;),
less-than (&lt;) and greater-than (&gt;) characters
Kurt G&ouml;del was a famous logician and mathematician.
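
Replacing the markup characters with character entities can be
sketched with Python's standard html module. The helper name is
hypothetical. Passing quote=False limits the replacement to the
less-than, greater-than and ampersand characters.

```python
from html import escape, unescape


def protect_markup(text):
    """Replace <, > and & with the lt, gt and amp character entities."""
    return escape(text, quote=False)


source = "if a < b && b > c"
protected = protect_markup(source)
```

The ampersand is replaced first, so the entities that the function
itself introduces are not escaped a second time.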
NOTE: To ensure that a string of characters is not
interpreted as markup, represent all occurrences of <,
>, and & by character or entity references.
NOTE: There are SGML features, CDATA and RCDATA, to
allow most <, >, and & characters to be entered without
the use of entity or character references. Because these
features tend to be used and implemented inconsistently,
and because they require 8-bit characters to represent
non-ASCII characters, they are not used in this version
of the HTML DTD. An earlier HTML specification included
an Example element (<XMP>) whose syntax is not
expressible in SGML. No markup was recognized inside of
the Example element except the </XMP> end tag. While
HTML user agents are encouraged to support this idiom,
its use is deprecated.
2.17 Character Entity Sets
The following entity names are used in HTML, always
prefixed by ampersand (&) and followed by a semicolon as
shown.
They represent particular graphic characters which have
special meanings in places in the markup, or may not be
part of the character set available to the writer.
2.17.1 Numeric and Special Graphic Entities
The following table lists each of the supported
characters specified in the Numeric and Special Graphic
entity set, along with its name, syntax for use, and
description. This list is derived from ISO Standard
8879:1986//ENTITIES Numeric and Special Graphic//EN;
however, HTML does not provide support for the entire
entity set. Only the entities listed below are
supported.
GLYPH   NAME    SYNTAX    DESCRIPTION
<       lt      &lt;      Less than sign
>       gt      &gt;      Greater than sign
&       amp     &amp;     Ampersand
"       quot    &quot;    Double quote sign
2.17.2 ISO Latin 1 Character Entities
The following table lists each of the characters
specified in the Added Latin 1 entity set, along with
its name, syntax for use, and description. This list is
derived from ISO Standard 8879:1986//ENTITIES Added
Latin 1//EN. HTML supports the entire entity set.
NAME      SYNTAX      DESCRIPTION
Aacute    &Aacute;    Capital A, acute accent
Agrave    &Agrave;    Capital A, grave accent
Acirc     &Acirc;     Capital A, circumflex accent
Atilde    &Atilde;    Capital A, tilde
Aring     &Aring;     Capital A, ring
Auml      &Auml;      Capital A, dieresis or umlaut mark
AElig     &AElig;     Capital AE diphthong (ligature)
Ccedil    &Ccedil;    Capital C, cedilla
Eacute    &Eacute;    Capital E, acute accent
Egrave    &Egrave;    Capital E, grave accent
Ecirc     &Ecirc;     Capital E, circumflex accent
Euml      &Euml;      Capital E, dieresis or umlaut mark
Iacute    &Iacute;    Capital I, acute accent
Igrave    &Igrave;    Capital I, grave accent
Icirc     &Icirc;     Capital I, circumflex accent
Iuml      &Iuml;      Capital I, dieresis or umlaut mark
ETH       &ETH;       Capital Eth, Icelandic
Ntilde    &Ntilde;    Capital N, tilde
Oacute    &Oacute;    Capital O, acute accent
Ograve    &Ograve;    Capital O, grave accent
Ocirc     &Ocirc;     Capital O, circumflex accent
Otilde    &Otilde;    Capital O, tilde
Ouml      &Ouml;      Capital O, dieresis or umlaut mark
Oslash    &Oslash;    Capital O, slash
Uacute    &Uacute;    Capital U, acute accent
Ugrave    &Ugrave;    Capital U, grave accent
Ucirc     &Ucirc;     Capital U, circumflex accent
Uuml      &Uuml;      Capital U, dieresis or umlaut mark
Yacute    &Yacute;    Capital Y, acute accent
THORN     &THORN;     Capital THORN, Icelandic
szlig     &szlig;     Small sharp s, German (sz ligature)
aacute    &aacute;    Small a, acute accent
agrave    &agrave;    Small a, grave accent
acirc     &acirc;     Small a, circumflex accent
atilde    &atilde;    Small a, tilde
aring     &aring;     Small a, ring
auml      &auml;      Small a, dieresis or umlaut mark
aelig     &aelig;     Small ae diphthong (ligature)
ccedil    &ccedil;    Small c, cedilla
eacute    &eacute;    Small e, acute accent
egrave    &egrave;    Small e, grave accent
ecirc     &ecirc;     Small e, circumflex accent
euml      &euml;      Small e, dieresis or umlaut mark
iacute    &iacute;    Small i, acute accent
igrave    &igrave;    Small i, grave accent
icirc     &icirc;     Small i, circumflex accent
iuml      &iuml;      Small i, dieresis or umlaut mark
eth       &eth;       Small eth, Icelandic
ntilde    &ntilde;    Small n, tilde
oacute    &oacute;    Small o, acute accent
ograve    &ograve;    Small o, grave accent
ocirc     &ocirc;     Small o, circumflex accent
otilde    &otilde;    Small o, tilde
ouml      &ouml;      Small o, dieresis or umlaut mark
oslash    &oslash;    Small o, slash
uacute    &uacute;    Small u, acute accent
ugrave    &ugrave;    Small u, grave accent
ucirc     &ucirc;     Small u, circumflex accent
uuml      &uuml;      Small u, dieresis or umlaut mark
yacute    &yacute;    Small y, acute accent
thorn     &thorn;     Small thorn, Icelandic
yuml      &yuml;      Small y, dieresis or umlaut mark
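
The relationship between an entity name, its syntax and the character
it stands for can be explored with Python's standard html.entities
module, whose name2codepoint table covers the later HTML 4 entity
names, a superset of the set above. The two helper names in this
sketch are hypothetical.

```python
from html.entities import name2codepoint


def entity_syntax(name):
    """Return the reference syntax for an entity name, e.g. &ouml;."""
    return "&%s;" % name


def entity_character(name):
    """Return the character that the named entity represents."""
    return chr(name2codepoint[name])
```

For instance, entity_character("Aacute") yields the character at
position 193 of ISO-8859-1, capital A with an acute accent.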
2.17.3 Numerical Character References
This list, sorted numerically, is derived from ISO-8859-1
8-bit single-byte coded graphic character set:
REFERENCE         DESCRIPTION
&#00; - &#08;     Unused
&#09;             Horizontal tab
&#10;             Line feed
&#11; - &#31;     Unused
&#32;             Space
&#33;             Exclamation mark
&#34;             Quotation mark
&#35;             Number sign
&#36;             Dollar sign
&#37;             Percent sign
&#38;             Ampersand
&#39;             Apostrophe
&#40;             Left parenthesis
&#41;             Right parenthesis
&#42;             Asterisk
&#43;             Plus sign
&#44;             Comma
&#45;             Hyphen
&#46;             Period (fullstop)
&#47;             Solidus (slash)
&#48; - &#57;     Digits 0-9
&#58;             Colon
&#59;             Semi-colon
&#60;             Less than
&#61;             Equals sign
&#62;             Greater than
&#63;             Question mark
&#64;             Commercial at
&#65; - &#90;     Letters A-Z
&#91;             Left square bracket
&#92;             Reverse solidus (backslash)
&#93;             Right square bracket
&#94;             Caret
&#95;             Horizontal bar
&#96;             Grave accent
&#97; - &#122;    Letters a-z
&#123;            Left curly brace
&#124;            Vertical bar
&#125;            Right curly brace
&#126;            Tilde
&#127; - &#160;   Unused
&#161;            Inverted exclamation
&#162;            Cent sign
&#163;            Pound sterling
&#164;            General currency sign
&#165;            Yen sign
&#166;            Broken vertical bar
&#167;            Section sign
&#168;            Umlaut (dieresis)
&#169;            Copyright
&#170;            Feminine ordinal
&#171;            Left angle quote, guillemotleft
&#172;            Not sign
&#173;            Soft hyphen
&#174;            Registered trademark
&#175;            Macron accent
&#176;            Degree sign
&#177;            Plus or minus
&#178;            Superscript two
&#179;            Superscript three
&#180;            Acute accent
&#181;            Micro sign
&#182;            Paragraph sign
&#183;            Middle dot
&#184;            Cedilla
&#185;            Superscript one
&#186;            Masculine ordinal
&#187;            Right angle quote, guillemotright
&#188;            Fraction one-fourth
&#189;            Fraction one-half
&#190;            Fraction three-fourths
&#191;            Inverted question mark
&#192;            Capital A, grave accent
&#193;            Capital A, acute accent
&#194;            Capital A, circumflex accent
&#195;            Capital A, tilde
&#196;            Capital A, dieresis or umlaut mark
&#197;            Capital A, ring
&#198;            Capital AE diphthong (ligature)
&#199;            Capital C, cedilla
&#200;            Capital E, grave accent
&#201;            Capital E, acute accent
&#202;            Capital E, circumflex accent
&#203;            Capital E, dieresis or umlaut mark
&#204;            Capital I, grave accent
&#205;            Capital I, acute accent
&#206;            Capital I, circumflex accent
&#207;            Capital I, dieresis or umlaut mark
&#208;            Capital Eth, Icelandic
&#209;            Capital N, tilde
&#210;            Capital O, grave accent
&#211;            Capital O, acute accent
&#212;            Capital O, circumflex accent
&#213;            Capital O, tilde
&#214;            Capital O, dieresis or umlaut mark
&#215;            Multiply sign
&#216;            Capital O, slash
&#217;            Capital U, grave accent
&#218;            Capital U, acute accent
&#219;            Capital U, circumflex accent
&#220;            Capital U, dieresis or umlaut mark
&#221;            Capital Y, acute accent
&#222;            Capital THORN, Icelandic
&#223;            Small sharp s, German (sz ligature)
&#224;            Small a, grave accent
&#225;            Small a, acute accent
&#226;            Small a, circumflex accent
&#227;            Small a, tilde
&#228;            Small a, dieresis or umlaut mark
&#229;            Small a, ring
&#230;            Small ae diphthong (ligature)
&#231;            Small c, cedilla
&#232;            Small e, grave accent
&#233;            Small e, acute accent
&#234;            Small e, circumflex accent
&#235;            Small e, dieresis or umlaut mark
&#236;            Small i, grave accent
&#237;            Small i, acute accent
&#238;            Small i, circumflex accent
&#239;            Small i, dieresis or umlaut mark
&#240;            Small eth, Icelandic
&#241;            Small n, tilde
&#242;            Small o, grave accent
&#243;            Small o, acute accent
&#244;            Small o, circumflex accent
&#245;            Small o, tilde
&#246;            Small o, dieresis or umlaut mark
&#247;            Division sign
&#248;            Small o, slash
&#249;            Small u, grave accent
&#250;            Small u, acute accent
&#251;            Small u, circumflex accent
&#252;            Small u, dieresis or umlaut mark
&#253;            Small y, acute accent
&#254;            Small thorn, Icelandic
&#255;            Small y, dieresis or umlaut mark
3. Security Considerations
Anchors, embedded images, and all other elements which
contain URIs as parameters may cause the URI to be
dereferenced in response to user input. In this case,
the security considerations of the URI specification
apply.
Documents may be constructed whose visible contents
mislead the reader to follow a link to unsuitable or
offensive material.
4. Obsolete and Proposed Features
4.1 Obsolete Features
This section describes elements that are no longer part
of HTML. Client implementors should implement these
obsolete elements for compatibility with previous
versions of the HTML specification.
4.1.1 Comment
The Comment element is used to delimit unneeded text and
comments. The Comment element has been introduced in
some HTML applications but should be replaced by the
SGML comment feature in new HTML user agents (see
Section 2.6.5).
4.1.2 Highlighted Phrase
The Highlighted Phrase element () should be ignored
if not implemented. This element has been replaced by
more meaningful elements (see Section 2.9).
Example of use:
first highlighted phrase non
highlighted textsecond highlighted
phrase etc.
4.1.3 Plain Text
The Plain Text element is used to terminate the HTML
entity and to indicate that what follows is not SGML
and does not require parsing. Instead, an old HTTP
convention specified that what followed was an ASCII
(MIME "text/plain") body. Its presence is an
optimization. There is no closing tag.
Example of use:
0001 This is line one of a long listing
0002 file from which is sent
4.1.4 Example and Listing
... and ...
The Example element and Listing element have been
replaced by the Preformatted Text element.
These styles allow text of fixed-width characters to be
embedded absolutely as is into the document. The syntax
is:
<LISTING> ... </LISTING>
or
<XMP> ... </XMP>
The text between these tags is typically rendered in a
monospaced font so that any formatting done by character
spacing on successive lines will be maintained.
Between the opening and closing tags:
- The text may contain any ISO Latin-1 printable
characters, except for the end tag opener. The Example
and Listing elements have historically used
specifications which do not conform to SGML.
Specifically, the text may contain ISO Latin printable
characters, including the tag opener, as long as it
does not contain the closing tag in full.
- SGML does not support this form. HTML user agents
may vary on how they interpret other tags within Example
and Listing elements.
- Line boundaries within the text are rendered as a
move to the beginning of the next line, except for one
immediately following a start tag or immediately
preceding an end tag.
- The horizontal tab character must be
interpreted as the smallest positive nonzero number of
spaces which will leave the number of characters so far
on the line as a multiple of 8. Its use is not
recommended.
The Listing element is rendered so that at least 132
characters fit on a line. The Example element is
rendered so that at least 80 characters fit on a line
but is otherwise identical to the Listing element.
4.2 Proposed Features
This section describes proposed HTML elements and
entities that are not currently supported under HTML
Levels 0, 1, or 2, but may be supported in the future.
4.2.1 Defining Instance
...
The Defining Instance element indicates the defining
instance of a term. The typical rendering is bold or
bold italic. This element is not widely supported.
4.2.2 Special Characters
To indicate special characters, HTML uses entity or
numeric representations. Additional character
presentations are proposed:
CHARACTER             REPRESENTATION
Non-breaking space    &nbsp;
Soft-hyphen           &shy;
Registered            &reg;
Copyright             &copy;
4.2.3 Strike
...
The Strike element is proposed to indicate
strikethrough, a font style in which a horizontal line
appears through characters. This element is not widely
supported.
4.2.4 Underline
...
The Underline element is proposed to indicate that the
text should be rendered as underlined. This proposed tag
is not supported by all HTML user agents.
Example of use:
The text shown here is rendered in the document
as underlined.
5. HTML Document Type Definitions
5.1 SGML Declaration for HTML
This is the SGML Declaration for HyperText Markup Language
(HTML) as used by the World Wide Web (WWW) application:
5.1.1 Sample SGML Open Style Entity Catalog for HTML
The SGML standard describes an "entity manager" as the
portion or component of an SGML system that maps SGML
entities into the actual storage model (e.g., the file
system). The standard itself does not define a particular
mapping methodology or notation.
To assist the interoperability among various SGML tools and
systems, the SGML Open consortium has passed a technical
resolution that defines a format for an
application-independent entity catalog that maps external
identifiers and/or entity names to file names.
Each entry in the catalog associates a storage object
identifier (such as a file name) with information about the
external entity that appears in the SGML document. In
addition to entries that associate public identifiers, a
catalog entry can associate an entity name with a storage
object identifier. For example, the following are
possible catalog entries:
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN" "iso-lat1.gml"
PUBLIC "-//ACME DTD Writers//DTD General Report//EN" report.dtd
ENTITY "graph1" "graphics\graph1.cgm"
In particular, the following shows entries relevant to HTML.
-- catalog: SGML Open style entity catalog for HTML --
-- $Id: catalog,v 1.1 1994/10/07 21:35:07 connolly Exp $ --
-- Ways to refer to Level 2: most general to most specific --
PUBLIC "-//IETF//DTD HTML//EN" html.dtd
PUBLIC "-//IETF//DTD HTML//EN//2.0" html.dtd
PUBLIC "-//IETF//DTD HTML Level 2//EN" html.dtd
PUBLIC "-//IETF//DTD HTML Level 2//EN//2.0" html.dtd
-- Ways to refer to Level 1: most general to most specific --
PUBLIC "-//IETF//DTD HTML Level 1//EN" html-1.dtd
PUBLIC "-//IETF//DTD HTML Level 1//EN//2.0" html-1.dtd
-- Ways to refer to Level 0: most general to most specific --
PUBLIC "-//IETF//DTD HTML Level 0//EN" html-0.dtd
PUBLIC "-//IETF//DTD HTML Level 0//EN//2.0" html-0.dtd
-- ISO latin 1 entity set for HTML --
PUBLIC "-//IETF//ENTITIES Added Latin 1//EN" ISOlat1.sgml
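How a tool might read such a catalog can be sketched as
follows. This is a simplified illustration, not the SGML Open
format definition: it assumes that an entry is a keyword
(PUBLIC or ENTITY) followed by two tokens, that a token is
either a quoted literal or a run of non-space characters, and
that comments are delimited by pairs of hyphens. The function
name parse_catalog is hypothetical.

```python
import re

# One alternative per lexical item: a "-- comment --", a "quoted
# literal" (group 1), or a bare token (group 2). Simplified grammar.
_ITEM = re.compile(r'--.*?--|"([^"]*)"|(\S+)', re.DOTALL)


def parse_catalog(text):
    """Return (public, entity) dicts mapping identifiers to storage objects."""
    tokens = []
    for match in _ITEM.finditer(text):
        if match.group(1) is not None:
            tokens.append(match.group(1))
        elif match.group(2) is not None:
            tokens.append(match.group(2))
        # Comments match neither group and are skipped.

    public, entity = {}, {}
    # Entries are read as (keyword, identifier, storage object) triples.
    for i in range(0, len(tokens) - 2, 3):
        keyword, name, storage = tokens[i:i + 3]
        if keyword == "PUBLIC":
            public[name] = storage
        elif keyword == "ENTITY":
            entity[name] = storage
    return public, entity


SAMPLE = r'''
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN" "iso-lat1.gml"
ENTITY "graph1" "graphics\graph1.cgm"
-- Ways to refer to Level 2 --
PUBLIC "-//IETF//DTD HTML//EN" html.dtd
PUBLIC "-//IETF//DTD HTML Level 1//EN" html-1.dtd
'''

if __name__ == "__main__":
    public, entity = parse_catalog(SAMPLE)
    print(public["-//IETF//DTD HTML//EN"])
    print(entity["graph1"])
```

An entity manager built this way would look up the public
identifier from a document type declaration and open the file
named by the matching entry.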
5.2 HTML DTD
This is the Document Type Definition for the
HyperText Markup Language (HTML DTD):
[DTD listing not recoverable: only fragments of the markup declarations survive.]
5.2.1 ISO Latin 1 Definitions for HTML
5.3 HTML Level 0 DTD
This is the Document Type Definition for the HyperText
Markup Language as used by minimally conforming World Wide
Web applications (HTML Level 0 DTD):
[DTD listing not recoverable; it ends with a reference to the %html; parameter entity.]
5.4 HTML Level 1 DTD
This is the Document Type Definition for the HyperText
Markup Language with Level 1 Extensions (HTML Level 1 DTD):
[DTD listing not recoverable; it ends with a reference to the %html; parameter entity.]
7. Glossary
The HTML specification uses these words with precise
meanings:
attribute
A syntactical component of an HTML element which is
often used to specify a characteristic quality of an
element, other than type or content.
document type definition (DTD)
A DTD is a collection of declarations (entity, element,
attribute, link, map, etc.) in SGML syntax that defines
the components and structures available for a class
(type) of documents.
element
A component of the hierarchical structure defined by the
document type definition; it is identified in a document
instance by descriptive markup, usually a start-tag and
an end-tag.
HTML
HyperText Markup Language.
HTML user agent
Any tool used with HTML documents.
HTML document
A collection of information represented as a sequence of
characters. An HTML document consists of data characters
and markup. In particular, the markup describes a
structure conforming to the HTML document type
definition.
HTTP
A generic stateless object-oriented protocol, which may
be used for many similar tasks by extending the
commands, or "methods", used. For example, you might use
HTTP for name servers and distributed object-oriented
systems. With HTTP, the negotiation of data
representation allows systems to be built independent of
the development of new representations. For more
information see:
http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTTP2.html
(document) instance
The document itself, including the actual content with
the actual markup. It can be a single document or part of
a document instance set that follows the DTD.
markup
Text added to the data of a document to convey
information about it. There are four different kinds of
markup: descriptive markup (tags), references, markup
declarations, and processing instructions.
Multipurpose Internet Mail Extensions (MIME)
An extension to Internet email which provides the
ability to transfer non-textual data, such as graphics,
audio and fax. It is defined in RFC 1341.
representation
The encoding of information for interchange. For
example, HTML is a representation of hypertext.
rendering
Formatting and presenting information.
SGML
Standard Generalized Markup Language is a data encoding
that allows the information in documents to be shared -
either by other document publishing systems or by
applications for electronic delivery, configuration
management, database management, inventory control, etc.
Defined in ISO 8879:1986 Information Processing Text and
Office Systems; Standard Generalized Markup Language
(SGML).
SGMLS
An SGML parser by James Clark, jjc@jclark.com, derived
from the ARCSGML parser materials which were written by
Charles F. Goldfarb. The source is available at
ftp.ifi.uio.no/pub/SGML/SGMLS.
tag
Descriptive markup. There are two kinds of tags:
start-tags and end-tags.
URI
Universal Resource Identifier (URI) is the name for a
generic WWW identifier. The URI specification simply
defines the syntax for encoding arbitrary naming or
addressing schemes, and has a list of such schemes. See
also: http://info.cern.ch/hypertext/WWW/Addressing/Addressing.html
WWW
A hypertext-based, distributed information system
created by researchers at CERN in Switzerland. Users may
create, edit or browse hypertext documents. The clients
and servers are freely available. See also:
http://info.cern.ch/hypertext/WWW/TheProject.html
7.1 Imperatives
may
The implementation is not obliged to follow this in any
way.
must
If this is not followed, the implementation does not
conform to this specification.
shall
If this is not followed, the implementation does not
conform to this specification.
should
If this is not followed, though the implementation
officially conforms to the specification, undesirable
results may occur in practice.
typical
Typical rendering is described for many elements. This
is not a mandatory part of the specification but is
given as guidance for designers and to help explain the
uses for which the elements were intended.
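The effect of these imperatives on conformance can be
summarized in a short sketch. It is an illustration only. The
names Outcome, IMPERATIVE_OUTCOME and assess are hypothetical,
and treating "may" and "typical" as having no effect on
conformance is an interpretation of the definitions above.

```python
from enum import IntEnum


class Outcome(IntEnum):
    """Possible results, ordered from least to most severe."""
    CONFORMS = 0
    CONFORMS_WITH_RISK = 1   # conforms, but undesirable results may occur
    DOES_NOT_CONFORM = 2


# What follows when an implementation does NOT follow a statement
# that carries the given imperative.
IMPERATIVE_OUTCOME = {
    "may": Outcome.CONFORMS,
    "typical": Outcome.CONFORMS,
    "should": Outcome.CONFORMS_WITH_RISK,
    "must": Outcome.DOES_NOT_CONFORM,
    "shall": Outcome.DOES_NOT_CONFORM,
}


def assess(unmet_imperatives):
    """Return the most severe outcome among the statements not followed."""
    return max(
        (IMPERATIVE_OUTCOME[word.lower()] for word in unmet_imperatives),
        default=Outcome.CONFORMS,
    )


if __name__ == "__main__":
    print(assess(["may", "should"]).name)
    print(assess(["should", "must"]).name)
```

An implementation that ignores only "should" statements still
conforms, while ignoring a single "must" or "shall" statement
is enough to make it nonconforming.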
8. References
The HTML specification cites these works:
HTTP
HTTP: A Protocol for Networked Information. This
document is available at
http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTTP2.html.
MIME
N. Borenstein, N. Freed, MIME (Multipurpose Internet
Mail Extensions) Part One: Mechanisms for Specifying and
Describing the Format of Internet Message Bodies,
RFC 1521, 09/23/1993. (Pages=81) (Format=.txt, .ps)
(Obsoletes RFC1341) (Updated by RFC1590).
SGML
ISO Standard 8879:1986 Information Processing Text and
Office Systems; Standard Generalized Markup Language
(SGML).
SGMLS
An SGML parser by James Clark, jjc@jclark.com, derived
from the ARCSGML parser materials which were written by
Charles F. Goldfarb. The source is available at
ftp.ifi.uio.no/pub/SGML/SGMLS.
URI
Universal Resource Identifiers. Available by anonymous
FTP as PostScript (info.cern.ch/pub/www/doc/url.ps) or
text (info.cern.ch/pub/www/doc/url.txt).
WWW
The World Wide Web, a global information initiative.
For bootstrap information, telnet info.cern.ch or find
documents by ftp://info.cern.ch/pub/www/doc.
9. Acknowledgments
The HTML document type was designed by Tim Berners-Lee
at CERN as part of the 1990 World Wide Web project. In
1992, Dan Connolly wrote the HTML Document Type
Definition (DTD) and a brief HTML specification.
Since 1993, a wide variety of Internet participants have
contributed to the evolution of HTML, which has included
the addition of in-line images introduced by the NCSA
Mosaic software for WWW. Dave Raggett played an
important role in deriving the FORMS material from the
HTML+ specification.
Dan Connolly and Karen Olson Muldrow rewrote the HTML
Specification in 1994.
Special thanks to the many people who have contributed
to this specification:
- Terry Allen; O'Reilly & Associates; terry@ora.com
- Marc Andreessen; Netscape Communications Corp;
marca@mcom.com
- Paul Burchard; The Geometry Center, University of
Minnesota; burchard@geom.umn.edu
- James Clark; jjc@jclark.com
- Daniel W. Connolly; HaL Computer Systems; connolly@hal.com
- Roy Fielding; University of California, Irvine;
fielding@ics.uci.edu
- Peter Flynn; University College Cork, Ireland; pflynn@www.ucc.ie
- Jay Glicksman; Enterprise Integration Technology; jay@eit.com
- Paul Grosso; ArborText, Inc.; paul@arbortext.com
- Eduardo Gutentag; Sun Microsystems; eduardo@Eng.Sun.com
- Bill Hefley; Software Engineering Institute,
Carnegie Mellon University; weh@sei.cmu.edu
- Chung-Jen Ho; Xerox Corporation; cho@xsoft.xerox.com
- Mike Knezovich; Spyglass, Inc.; mike@spyglass.com
- Tim Berners-Lee; CERN; timbl@info.cern.ch
- Tom Magliery; NCSA; mag@ncsa.uiuc.edu
- Murray Maloney; Toronto Development Centre, The
Santa Cruz Operation (SCO); murray@sco.com
- Larry Masinter; Xerox Palo Alto Research Center;
masinter@parc.xerox.com
- Karen Olson Muldrow; HaL Computer Systems; karen@hal.com
- Bill Perry; Spry, Inc.; wmperry@spry.com
- Dave Raggett; Hewlett Packard; dsr@hplb.hpl.hp.com
- E. Corprew Reed; Cold Spring Harbor Laboratory; corp@cshl.org
- Yuri Rubinsky; SoftQuad, Inc.; yuri@sq.com
- Eric Schieler; Spyglass, Inc.; eschieler@spyglass.com
- James L. Seidman; Spyglass, Inc.; jim@spyglass.com
- Eric W. Sink; Spyglass, Inc.; eric@spyglass.com
- Stuart Weibel; OCLC Office of Research; weibel@oclc.org
- Chris Wilson; Spry, Inc.; cwilson@spry.com
10. Authors' Addresses
Tim Berners-Lee
timbl@quag.lcs.mit.edu
Daniel W. Connolly
HaL Software Systems
3006A Longhorn Blvd.
Austin, TX 78758
phone: (512) 834-9962 extension 5010
fax: (512) 823-9963
URL: http://www.hal.com/~connolly
email: connolly@hal.com